Overview

Dataset statistics

Number of variables: 15
Number of observations: 838
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 98.3 KiB
Average record size in memory: 120.2 B

Variable types

Numeric: 7
Categorical: 8

Alerts

Name has a high cardinality: 838 distinct values (High cardinality)
Age is highly overall correlated with age_labels (High correlation)
SibSp is highly overall correlated with FamilySize and 1 other field (High correlation)
Parch is highly overall correlated with FamilySize and 1 other field (High correlation)
Fare is highly overall correlated with FamilySize (High correlation)
FamilySize is highly overall correlated with SibSp and 3 other fields (High correlation)
age_labels is highly overall correlated with Age (High correlation)
Survived is highly overall correlated with Sex and 1 other field (High correlation)
Sex is highly overall correlated with Survived and 1 other field (High correlation)
Embarked is highly overall correlated with EmbarkedIndex (High correlation)
SexIndex is highly overall correlated with Survived and 1 other field (High correlation)
EmbarkedIndex is highly overall correlated with Embarked (High correlation)
IsAlone is highly overall correlated with SibSp and 2 other fields (High correlation)
PassengerId is uniformly distributed (Uniform)
Name is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
Name has unique values (Unique)
SibSp has 568 (67.8%) zeros (Zeros)
Parch has 638 (76.1%) zeros (Zeros)
Fare has 13 (1.6%) zeros (Zeros)
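The Zeros alerts give the share of zero-valued cells per column. As a minimal sketch, assuming the percentage is the zero count divided by the 838 observations and rounded to one decimal, the figures can be reproduced as follows. `zero_share` is a hypothetical helper written for this illustration, not part of pandas-profiling.

```python
# Zero counts as listed in the alerts; n_rows is the report's number of observations.
n_rows = 838
zero_counts = {"SibSp": 568, "Parch": 638, "Fare": 13}


def zero_share(count: int, total: int) -> float:
    """Return the percentage of zero-valued cells, rounded to one decimal."""
    return round(100 * count / total, 1)


zero_shares = {name: zero_share(count, n_rows) for name, count in zero_counts.items()}

for name, share in zero_shares.items():
    print(f"{name}: {share}% zeros")
```

The rounded results agree with the percentages shown in the three Zeros alerts.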

Reproduction

Analysis started: 2023-07-27 09:17:24.666427
Analysis finished: 2023-07-27 09:17:41.268125
Duration: 16.6 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
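The reported duration follows from the two timestamps. A small sketch using only the Python standard library, with variable names chosen for illustration:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2023-07-27 09:17:24.666427")
finished = datetime.fromisoformat("2023-07-27 09:17:41.268125")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.1f} seconds")
```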

Variables

PassengerId
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 838
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 447.13484
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 44.85
Q1: 225.25
Median: 448.5
Q3: 669.75
95th percentile: 846.15
Maximum: 891
Range: 890
Interquartile range (IQR): 444.5

Descriptive statistics

Standard deviation: 258.28327
Coefficient of variation (CV): 0.57764066
Kurtosis: -1.1985332
Mean: 447.13484
Median Absolute Deviation (MAD): 222.5
Skewness: -0.012789643
Sum: 374699
Variance: 66710.246
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 0.1%
560 | 1 | 0.1%
589 | 1 | 0.1%
590 | 1 | 0.1%
591 | 1 | 0.1%
592 | 1 | 0.1%
593 | 1 | 0.1%
594 | 1 | 0.1%
595 | 1 | 0.1%
596 | 1 | 0.1%
Other values (828) | 828 | 98.8%

Minimum values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum values

Value | Count | Frequency (%)
891 | 1 | 0.1%
890 | 1 | 0.1%
888 | 1 | 0.1%
887 | 1 | 0.1%
886 | 1 | 0.1%
885 | 1 | 0.1%
884 | 1 | 0.1%
883 | 1 | 0.1%
882 | 1 | 0.1%
881 | 1 | 0.1%
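
The summary figures reported for PassengerId are tied together by a few simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum). A minimal consistency check, using only values printed in this report; the variable names and the relative tolerance are assumptions that allow for the report's rounding:

```python
import math

# Values copied from the PassengerId section; n is the number of observations.
n = 838
mean, std, variance, total = 447.13484, 258.28327, 66710.246, 374699
q1, q3, minimum, maximum = 225.25, 669.75, 1, 891
reported_cv, reported_iqr, reported_range = 0.57764066, 444.5, 890

checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, reported_cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": q3 - q1 == reported_iqr,
    "range = max - min": maximum - minimum == reported_range,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

The same identities can be checked against the other numeric variables by swapping in their reported values.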

Survived
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

0: 534
1: 304

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 534 | 63.7%
1 | 304 | 36.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[plot not preserved]

Value | Count | Frequency (%)
0 | 534 | 63.7%
1 | 304 | 36.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 534 | 63.7%
1 | 304 | 36.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 534 | 63.7%
1 | 304 | 36.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 534 | 63.7%
1 | 304 | 36.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 534 | 63.7%
1 | 304 | 36.3%

Pclass
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

3: 460
1: 207
2: 171

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value | Count | Frequency (%)
3 | 460 | 54.9%
1 | 207 | 24.7%
2 | 171 | 20.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[plot not preserved]

Value | Count | Frequency (%)
3 | 460 | 54.9%
1 | 207 | 24.7%
2 | 171 | 20.4%

Most occurring characters

Value | Count | Frequency (%)
3 | 460 | 54.9%
1 | 207 | 24.7%
2 | 171 | 20.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 460 | 54.9%
1 | 207 | 24.7%
2 | 171 | 20.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 460 | 54.9%
1 | 207 | 24.7%
2 | 171 | 20.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 460 | 54.9%
1 | 207 | 24.7%
2 | 171 | 20.4%

Name
Categorical

HIGH CARDINALITY  UNIFORM  UNIQUE 

Distinct: 838
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Braund, Mr. Owen Harris: 1
de Messemaeker, Mrs. Guillaume Joseph (Emma): 1
Gilinski, Mr. Eliezer: 1
Murdlin, Mr. Joseph: 1
Rintamaki, Mr. Matti: 1
Other values (833): 833

Length

Max length: 82
Median length: 51
Mean length: 26.386635
Min length: 12

Characters and Unicode

Total characters: 22112
Distinct characters: 59
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 838
Unique (%): 100.0%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Heikkinen, Miss. Laina
4th row: Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row: Allen, Mr. William Henry

Common Values

Value | Count | Frequency (%)
Braund, Mr. Owen Harris | 1 | 0.1%
de Messemaeker, Mrs. Guillaume Joseph (Emma) | 1 | 0.1%
Gilinski, Mr. Eliezer | 1 | 0.1%
Murdlin, Mr. Joseph | 1 | 0.1%
Rintamaki, Mr. Matti | 1 | 0.1%
Stephenson, Mrs. Walter Bertram (Martha Eustis) | 1 | 0.1%
Elsbury, Mr. William James | 1 | 0.1%
Bourke, Miss. Mary | 1 | 0.1%
Chapman, Mr. John Henry | 1 | 0.1%
Van Impe, Mr. Jean Baptiste | 1 | 0.1%
Other values (828) | 828 | 98.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
mr | 501 | 15.0%
miss | 156 | 4.7%
mrs | 121 | 3.6%
william | 59 | 1.8%
john | 40 | 1.2%
master | 36 | 1.1%
henry | 32 | 1.0%
james | 23 | 0.7%
charles | 22 | 0.7%
thomas | 21 | 0.6%
Other values (1425) | 2340 | 69.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2515 | 11.4%
r | 1827 | 8.3%
e | 1560 | 7.1%
a | 1538 | 7.0%
i | 1209 | 5.5%
n | 1200 | 5.4%
s | 1192 | 5.4%
M | 1044 | 4.7%
l | 980 | 4.4%
o | 932 | 4.2%
Other values (49) | 8115 | 36.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 14285 | 64.6%
Uppercase Letter | 3369 | 15.2%
Space Separator | 2515 | 11.4%
Other Punctuation | 1684 | 7.6%
Close Punctuation | 123 | 0.6%
Open Punctuation | 123 | 0.6%
Dash Punctuation | 13 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 1827 | 12.8%
e | 1560 | 10.9%
a | 1538 | 10.8%
i | 1209 | 8.5%
n | 1200 | 8.4%
s | 1192 | 8.3%
l | 980 | 6.9%
o | 932 | 6.5%
t | 616 | 4.3%
h | 476 | 3.3%
Other values (16) | 2755 | 19.3%

Uppercase Letter
Value | Count | Frequency (%)
M | 1044 | 31.0%
A | 232 | 6.9%
J | 206 | 6.1%
H | 177 | 5.3%
S | 174 | 5.2%
C | 160 | 4.7%
E | 155 | 4.6%
W | 133 | 3.9%
B | 132 | 3.9%
L | 119 | 3.5%
Other values (15) | 837 | 24.8%

Other Punctuation
Value | Count | Frequency (%)
. | 839 | 49.8%
, | 838 | 49.8%
' | 6 | 0.4%
/ | 1 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 2515 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 123 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 123 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 13 | 100.0%
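
The category names in the tables above (Lowercase Letter, Space Separator, Other Punctuation, and so on) correspond to Unicode general categories. As an illustrative sketch, Python's standard `unicodedata` module can reproduce this kind of breakdown for a single name. The `CATEGORY_NAMES` mapping and the `category_counts` helper are assumptions made for this example; they are not taken from pandas-profiling.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the raw two-letter code for categories not listed above.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


counts = category_counts("Braund, Mr. Owen Harris")
for name, count in counts.most_common():
    print(f"{name}: {count}")
```

Summing such counters over all 838 names would give a table of the same shape as "Most occurring categories".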

Most occurring scripts

Value | Count | Frequency (%)
Latin | 17654 | 79.8%
Common | 4458 | 20.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 1827 | 10.3%
e | 1560 | 8.8%
a | 1538 | 8.7%
i | 1209 | 6.8%
n | 1200 | 6.8%
s | 1192 | 6.8%
M | 1044 | 5.9%
l | 980 | 5.6%
o | 932 | 5.3%
t | 616 | 3.5%
Other values (41) | 5556 | 31.5%

Common
Value | Count | Frequency (%)
(space) | 2515 | 56.4%
. | 839 | 18.8%
, | 838 | 18.8%
) | 123 | 2.8%
( | 123 | 2.8%
- | 13 | 0.3%
' | 6 | 0.1%
/ | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 22112 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 2515 | 11.4%
r | 1827 | 8.3%
e | 1560 | 7.1%
a | 1538 | 7.0%
i | 1209 | 5.5%
n | 1200 | 5.4%
s | 1192 | 5.4%
M | 1044 | 4.7%
l | 980 | 4.4%
o | 932 | 4.2%
Other values (49) | 8115 | 36.7%

Sex
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

male: 556
female: 282

Length

Max length: 6
Median length: 4
Mean length: 4.673031
Min length: 4

Characters and Unicode

Total characters: 3916
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: male
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value | Count | Frequency (%)
male | 556 | 66.3%
female | 282 | 33.7%

Length

Histogram of lengths of the category

Common Values (Plot)

[plot not preserved]

Value | Count | Frequency (%)
male | 556 | 66.3%
female | 282 | 33.7%

Most occurring characters

Value | Count | Frequency (%)
e | 1120 | 28.6%
m | 838 | 21.4%
a | 838 | 21.4%
l | 838 | 21.4%
f | 282 | 7.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3916 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 1120 | 28.6%
m | 838 | 21.4%
a | 838 | 21.4%
l | 838 | 21.4%
f | 282 | 7.2%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3916 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 1120 | 28.6%
m | 838 | 21.4%
a | 838 | 21.4%
l | 838 | 21.4%
f | 282 | 7.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3916 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 1120 | 28.6%
m | 838 | 21.4%
a | 838 | 21.4%
l | 838 | 21.4%
f | 282 | 7.2%
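
The character tables for Sex follow directly from the two value counts: each character of "male" occurs 556 times and each character of "female" occurs 282 times. A short sketch of that arithmetic, with variable names chosen for illustration:

```python
from collections import Counter

# Value counts copied from the Common Values table for Sex.
value_counts = {"male": 556, "female": 282}

# Weight every character of a value by the number of rows holding that value.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(f"{ch}: {count} ({100 * count / total_chars:.1f}%)")
```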

Age
Real number (ℝ)

Distinct: 87
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.950274
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0.42
5th percentile: 5.85
Q1: 22
Median: 29.699118
Q3: 35
95th percentile: 54.15
Maximum: 80
Range: 79.58
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 13.0862
Coefficient of variation (CV): 0.43693088
Kurtosis: 0.95579935
Mean: 29.950274
Median Absolute Deviation (MAD): 6.3008824
Skewness: 0.443171
Sum: 25098.33
Variance: 171.24862
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
29.69911765 | 159 | 19.0%
24 | 27 | 3.2%
22 | 26 | 3.1%
30 | 25 | 3.0%
28 | 25 | 3.0%
18 | 24 | 2.9%
19 | 24 | 2.9%
25 | 23 | 2.7%
21 | 21 | 2.5%
36 | 20 | 2.4%
Other values (77) | 464 | 55.4%

Minimum values

Value | Count | Frequency (%)
0.42 | 1 | 0.1%
0.67 | 1 | 0.1%
0.75 | 2 | 0.2%
0.83 | 2 | 0.2%
0.92 | 1 | 0.1%
1 | 6 | 0.7%
2 | 10 | 1.2%
3 | 5 | 0.6%
4 | 10 | 1.2%
5 | 4 | 0.5%

Maximum values

Value | Count | Frequency (%)
80 | 1 | 0.1%
74 | 1 | 0.1%
71 | 2 | 0.2%
70.5 | 1 | 0.1%
70 | 2 | 0.2%
66 | 1 | 0.1%
65 | 3 | 0.4%
64 | 2 | 0.2%
63 | 2 | 0.2%
62 | 4 | 0.5%

SibSp
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52863962
Minimum: 0
Maximum: 8
Zeros: 568
Zeros (%): 67.8%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.09676
Coefficient of variation (CV): 2.0746837
Kurtosis: 17.012651
Mean: 0.52863962
Median Absolute Deviation (MAD): 0
Skewness: 3.5952151
Sum: 443
Variance: 1.2028825
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)
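
Common values

Value | Count | Frequency (%)
0 | 568 | 67.8%
1 | 200 | 23.9%
2 | 25 | 3.0%
4 | 18 | 2.1%
3 | 16 | 1.9%
8 | 6 | 0.7%
5 | 5 | 0.6%

Minimum values

Value | Count | Frequency (%)
0 | 568 | 67.8%
1 | 200 | 23.9%
2 | 25 | 3.0%
3 | 16 | 1.9%
4 | 18 | 2.1%
5 | 5 | 0.6%
8 | 6 | 0.7%

Maximum values

Value | Count | Frequency (%)
8 | 6 | 0.7%
5 | 5 | 0.6%
4 | 18 | 2.1%
3 | 16 | 1.9%
2 | 25 | 3.0%
1 | 200 | 23.9%
0 | 568 | 67.8%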

Parch
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.37947494
Minimum: 0
Maximum: 6
Zeros: 638
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80865019
Coefficient of variation (CV): 2.1309713
Kurtosis: 10.271277
Mean: 0.37947494
Median Absolute Deviation (MAD): 0
Skewness: 2.8199432
Sum: 318
Variance: 0.65391514
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)
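
Common values

Value | Count | Frequency (%)
0 | 638 | 76.1%
1 | 114 | 13.6%
2 | 71 | 8.5%
5 | 5 | 0.6%
3 | 5 | 0.6%
4 | 4 | 0.5%
6 | 1 | 0.1%

Minimum values

Value | Count | Frequency (%)
0 | 638 | 76.1%
1 | 114 | 13.6%
2 | 71 | 8.5%
3 | 5 | 0.6%
4 | 4 | 0.5%
5 | 5 | 0.6%
6 | 1 | 0.1%

Maximum values

Value | Count | Frequency (%)
6 | 1 | 0.1%
5 | 5 | 0.6%
4 | 4 | 0.5%
3 | 5 | 0.6%
2 | 71 | 8.5%
1 | 114 | 13.6%
0 | 638 | 76.1%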

Fare
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 244
Distinct (%): 29.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.736773
Minimum: 0
Maximum: 512.3292
Zeros: 13
Zeros (%): 1.6%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 7.225
Q1: 7.925
Median: 14.4542
Q3: 31.275
95th percentile: 113.275
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.35

Descriptive statistics

Standard deviation: 50.354437
Coefficient of variation (CV): 1.5381613
Kurtosis: 32.994321
Mean: 32.736773
Median Absolute Deviation (MAD): 6.9459
Skewness: 4.7494031
Sum: 27433.416
Variance: 2535.5693
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
8.05 | 41 | 4.9%
13 | 40 | 4.8%
7.8958 | 37 | 4.4%
7.75 | 29 | 3.5%
26 | 27 | 3.2%
10.5 | 23 | 2.7%
7.925 | 18 | 2.1%
7.775 | 16 | 1.9%
7.2292 | 15 | 1.8%
8.6625 | 13 | 1.6%
Other values (234) | 579 | 69.1%

Minimum values

Value | Count | Frequency (%)
0 | 13 | 1.6%
4.0125 | 1 | 0.1%
5 | 1 | 0.1%
6.2375 | 1 | 0.1%
6.4375 | 1 | 0.1%
6.45 | 1 | 0.1%
6.4958 | 2 | 0.2%
6.75 | 1 | 0.1%
6.8583 | 1 | 0.1%
6.95 | 1 | 0.1%

Maximum values

Value | Count | Frequency (%)
512.3292 | 3 | 0.4%
263 | 4 | 0.5%
262.375 | 1 | 0.1%
247.5208 | 2 | 0.2%
227.525 | 4 | 0.5%
221.7792 | 1 | 0.1%
211.5 | 1 | 0.1%
211.3375 | 3 | 0.4%
164.8667 | 2 | 0.2%
153.4625 | 3 | 0.4%

Embarked
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

S: 616
C: 159
Q: 63

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: C
3rd row: S
4th row: S
5th row: S

Common Values

Value | Count | Frequency (%)
S | 616 | 73.5%
C | 159 | 19.0%
Q | 63 | 7.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[plot not preserved]

Value | Count | Frequency (%)
s | 616 | 73.5%
c | 159 | 19.0%
q | 63 | 7.5%

Most occurring characters

Value | Count | Frequency (%)
S | 616 | 73.5%
C | 159 | 19.0%
Q | 63 | 7.5%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 838 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
S | 616 | 73.5%
C | 159 | 19.0%
Q | 63 | 7.5%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 838 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
S | 616 | 73.5%
C | 159 | 19.0%
Q | 63 | 7.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
S | 616 | 73.5%
C | 159 | 19.0%
Q | 63 | 7.5%

SexIndex
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

0.0: 556
1.0: 282

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 556 | 66.3%
1.0 | 282 | 33.7%

Length

Histogram of lengths of the category

Common Values (Plot)

[plot not preserved]

Value | Count | Frequency (%)
0.0 | 556 | 66.3%
1.0 | 282 | 33.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 1394 | 55.4%
. | 838 | 33.3%
1 | 282 | 11.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1394 | 83.2%
1 | 282 | 16.8%

Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1394 | 55.4%
. | 838 | 33.3%
1 | 282 | 11.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1394 | 55.4%
. | 838 | 33.3%
1 | 282 | 11.2%

EmbarkedIndex
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

0.0: 616
1.0: 159
2.0: 63

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 1.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 616 | 73.5%
1.0 | 159 | 19.0%
2.0 | 63 | 7.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[plot not preserved]

Value | Count | Frequency (%)
0.0 | 616 | 73.5%
1.0 | 159 | 19.0%
2.0 | 63 | 7.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 1454 | 57.8%
. | 838 | 33.3%
1 | 159 | 6.3%
2 | 63 | 2.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1454 | 86.8%
1 | 159 | 9.5%
2 | 63 | 3.8%

Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1454 | 57.8%
. | 838 | 33.3%
1 | 159 | 6.3%
2 | 63 | 2.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1454 | 57.8%
. | 838 | 33.3%
1 | 159 | 6.3%
2 | 63 | 2.5%
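
The counts for EmbarkedIndex (616 / 159 / 63) and SexIndex (556 / 282) match those of Embarked (S / C / Q) and Sex (male / female), which is consistent with a frequency-ordered label encoding in which the most common label receives 0.0. The report does not say how these columns were produced, so the following is only a sketch under that assumption; `frequency_index` is a hypothetical helper.

```python
def frequency_index(counts: dict) -> dict:
    """Map each label to a float index, most frequent label first.

    Ties are broken alphabetically; this tie rule is an assumption.
    """
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {label: float(position) for position, (label, _) in enumerate(ordered)}


# Value counts copied from the Embarked and Sex sections of this report.
embarked_index = frequency_index({"S": 616, "C": 159, "Q": 63})
sex_index = frequency_index({"male": 556, "female": 282})

print(embarked_index)
print(sex_index)
```

Under this assumption the mapping agrees with the sample rows, where S, C and Q appear alongside EmbarkedIndex values 0.0, 1.0 and 2.0.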

FamilySize
Real number (ℝ)

Distinct: 9
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9081146
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 6
Maximum: 11
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.6078975
Coefficient of variation (CV): 0.84266297
Kurtosis: 8.800414
Mean: 1.9081146
Median Absolute Deviation (MAD): 0
Skewness: 2.6839454
Sum: 1599
Variance: 2.5853343
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common values

Value | Count | Frequency (%)
1 | 502 | 59.9%
2 | 155 | 18.5%
3 | 95 | 11.3%
4 | 28 | 3.3%
6 | 22 | 2.6%
5 | 12 | 1.4%
7 | 12 | 1.4%
8 | 6 | 0.7%
11 | 6 | 0.7%

Minimum values

Value | Count | Frequency (%)
1 | 502 | 59.9%
2 | 155 | 18.5%
3 | 95 | 11.3%
4 | 28 | 3.3%
5 | 12 | 1.4%
6 | 22 | 2.6%
7 | 12 | 1.4%
8 | 6 | 0.7%
11 | 6 | 0.7%

Maximum values

Value | Count | Frequency (%)
11 | 6 | 0.7%
8 | 6 | 0.7%
7 | 12 | 1.4%
6 | 22 | 2.6%
5 | 12 | 1.4%
4 | 28 | 3.3%
3 | 95 | 11.3%
2 | 155 | 18.5%
1 | 502 | 59.9%

IsAlone
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

1: 502
0: 336

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 502 | 59.9%
0 | 336 | 40.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[plot not preserved]

Value | Count | Frequency (%)
1 | 502 | 59.9%
0 | 336 | 40.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 502 | 59.9%
0 | 336 | 40.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 502 | 59.9%
0 | 336 | 40.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 502 | 59.9%
0 | 336 | 40.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 502 | 59.9%
0 | 336 | 40.1%

age_labels
Real number (ℝ)

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.7875895
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 4
Median: 5
Q3: 6
95th percentile: 6
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.2794632
Coefficient of variation (CV): 0.26724581
Kurtosis: 1.8073897
Mean: 4.7875895
Median Absolute Deviation (MAD): 1
Skewness: -1.268688
Sum: 4012
Variance: 1.6370262
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
5 | 385 | 45.9%
6 | 199 | 23.7%
4 | 126 | 15.0%
3 | 41 | 4.9%
1 | 38 | 4.5%
7 | 26 | 3.1%
2 | 23 | 2.7%

Minimum values

Value | Count | Frequency (%)
1 | 38 | 4.5%
2 | 23 | 2.7%
3 | 41 | 4.9%
4 | 126 | 15.0%
5 | 385 | 45.9%
6 | 199 | 23.7%
7 | 26 | 3.1%

Maximum values

Value | Count | Frequency (%)
7 | 26 | 3.1%
6 | 199 | 23.7%
5 | 385 | 45.9%
4 | 126 | 15.0%
3 | 41 | 4.9%
2 | 23 | 2.7%
1 | 38 | 4.5%

Interactions

[Interaction scatter plots: 49 panels (7 x 7), one per pair of numeric variables; images not preserved.]

Correlations

Variable | PassengerId | Age | SibSp | Parch | Fare | FamilySize | age_labels | Survived | Pclass | Sex | Embarked | SexIndex | EmbarkedIndex | IsAlone
PassengerId | 1.000 | 0.034 | -0.078 | -0.005 | -0.028 | -0.061 | 0.027 | 0.110 | 0.028 | 0.072 | 0.000 | 0.072 | 0.000 | 0.033
Age | 0.034 | 1.000 | -0.159 | -0.206 | 0.116 | -0.188 | 0.945 | 0.157 | 0.253 | 0.101 | 0.144 | 0.101 | 0.144 | 0.346
SibSp | -0.078 | -0.159 | 1.000 | 0.447 | 0.442 | 0.852 | -0.152 | 0.195 | 0.157 | 0.224 | 0.092 | 0.224 | 0.092 | 0.839
Parch | -0.005 | -0.206 | 0.447 | 1.000 | 0.407 | 0.796 | -0.201 | 0.173 | 0.034 | 0.262 | 0.028 | 0.262 | 0.028 | 0.680
Fare | -0.028 | 0.116 | 0.442 | 0.407 | 1.000 | 0.525 | 0.120 | 0.313 | 0.485 | 0.206 | 0.195 | 0.206 | 0.195 | 0.306
FamilySize | -0.061 | -0.188 | 0.852 | 0.796 | 0.525 | 1.000 | -0.174 | 0.225 | 0.145 | 0.213 | 0.083 | 0.213 | 0.083 | 0.635
age_labels | 0.027 | 0.945 | -0.152 | -0.201 | 0.120 | -0.174 | 1.000 | 0.132 | 0.247 | 0.101 | 0.113 | 0.101 | 0.113 | 0.349
Survived | 0.110 | 0.157 | 0.195 | 0.173 | 0.313 | 0.225 | 0.132 | 1.000 | 0.356 | 0.547 | 0.161 | 0.547 | 0.161 | 0.218
Pclass | 0.028 | 0.253 | 0.157 | 0.034 | 0.485 | 0.145 | 0.247 | 0.356 | 1.000 | 0.146 | 0.242 | 0.146 | 0.242 | 0.136
Sex | 0.072 | 0.101 | 0.224 | 0.262 | 0.206 | 0.213 | 0.101 | 0.547 | 0.146 | 1.000 | 0.085 | 0.997 | 0.085 | 0.320
Embarked | 0.000 | 0.144 | 0.092 | 0.028 | 0.195 | 0.083 | 0.113 | 0.161 | 0.242 | 0.085 | 1.000 | 0.085 | 1.000 | 0.092
SexIndex | 0.072 | 0.101 | 0.224 | 0.262 | 0.206 | 0.213 | 0.101 | 0.547 | 0.146 | 0.997 | 0.085 | 1.000 | 0.085 | 0.320
EmbarkedIndex | 0.000 | 0.144 | 0.092 | 0.028 | 0.195 | 0.083 | 0.113 | 0.161 | 0.242 | 0.085 | 1.000 | 0.085 | 1.000 | 0.092
IsAlone | 0.033 | 0.346 | 0.839 | 0.680 | 0.306 | 0.635 | 0.349 | 0.218 | 0.136 | 0.320 | 0.092 | 0.320 | 0.092 | 1.000
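
The High correlation alerts in the Overview can be related back to this matrix. The sketch below takes a subset of the coefficients and flags every pair whose absolute value reaches a cut-off. The cut-off of 0.5 is an assumption: it is consistent with the alerts shown (Fare and FamilySize at 0.525 are flagged, Fare and Pclass at 0.485 are not), but the report itself does not state the threshold. `high_correlations` is a hypothetical helper.

```python
THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Subset of the correlation matrix above; each row follows the order in `columns`.
columns = ["SibSp", "Parch", "Fare", "FamilySize", "Pclass", "IsAlone"]
matrix = {
    "SibSp":      [1.000, 0.447, 0.442, 0.852, 0.157, 0.839],
    "Parch":      [0.447, 1.000, 0.407, 0.796, 0.034, 0.680],
    "Fare":       [0.442, 0.407, 1.000, 0.525, 0.485, 0.306],
    "FamilySize": [0.852, 0.796, 0.525, 1.000, 0.145, 0.635],
    "Pclass":     [0.157, 0.034, 0.485, 0.145, 1.000, 0.136],
    "IsAlone":    [0.839, 0.680, 0.306, 0.635, 0.136, 1.000],
}


def high_correlations(matrix, columns, threshold):
    """Return, per variable, the other variables with |coefficient| >= threshold."""
    flagged = {}
    for name, row in matrix.items():
        partners = [
            other
            for other, value in zip(columns, row)
            if other != name and abs(value) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(matrix, columns, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

Within this subset, the flagged pairs line up with the alerts for SibSp, Parch, Fare, FamilySize and IsAlone, and Pclass is not flagged.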

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

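First rows

Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked | SexIndex | EmbarkedIndex | FamilySize | IsAlone | age_labels
0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.000000 | 1.0 | 0 | 7.2500 | S | 0.0 | 0.0 | 2.0 | 0 | 4.0
1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.000000 | 1.0 | 0 | 71.2833 | C | 1.0 | 1.0 | 2.0 | 0 | 6.0
2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.000000 | 0.0 | 0 | 7.9250 | S | 1.0 | 0.0 | 1.0 | 1 | 5.0
3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.000000 | 1.0 | 0 | 53.1000 | S | 1.0 | 0.0 | 2.0 | 0 | 6.0
4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.000000 | 0.0 | 0 | 8.0500 | S | 0.0 | 0.0 | 1.0 | 1 | 6.0
5 | 6 | 0 | 3 | Moran, Mr. James | male | 29.699118 | 0.0 | 0 | 8.4583 | Q | 0.0 | 2.0 | 1.0 | 1 | 5.0
6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.000000 | 0.0 | 0 | 51.8625 | S | 0.0 | 0.0 | 1.0 | 1 | 6.0
7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.000000 | 3.0 | 1 | 21.0750 | S | 0.0 | 0.0 | 5.0 | 0 | 1.0
8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.000000 | 0.0 | 2 | 11.1333 | S | 1.0 | 0.0 | 3.0 | 0 | 5.0
9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.000000 | 1.0 | 0 | 30.0708 | C | 1.0 | 1.0 | 2.0 | 0 | 3.0

Last rows

Index | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked | SexIndex | EmbarkedIndex | FamilySize | IsAlone | age_labels
828 | 881 | 1 | 2 | Shelley, Mrs. William (Imanita Parrish Hall) | female | 25.0 | 0.0 | 1 | 26.0000 | S | 1.0 | 0.0 | 2.0 | 0 | 5.0
829 | 882 | 0 | 3 | Markun, Mr. Johann | male | 33.0 | 0.0 | 0 | 7.8958 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0
830 | 883 | 0 | 3 | Dahlberg, Miss. Gerda Ulrika | female | 22.0 | 0.0 | 0 | 10.5167 | S | 1.0 | 0.0 | 1.0 | 1 | 4.0
831 | 884 | 0 | 2 | Banfield, Mr. Frederick James | male | 28.0 | 0.0 | 0 | 10.5000 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0
832 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.0 | 0.0 | 0 | 7.0500 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0
833 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | female | 39.0 | 0.0 | 5 | 29.1250 | Q | 1.0 | 2.0 | 6.0 | 0 | 6.0
834 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0.0 | 0 | 13.0000 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0
835 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0.0 | 0 | 30.0000 | S | 1.0 | 0.0 | 1.0 | 1 | 4.0
836 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0.0 | 0 | 30.0000 | C | 0.0 | 1.0 | 1.0 | 1 | 5.0
837 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0.0 | 0 | 7.7500 | Q | 0.0 | 2.0 | 1.0 | 1 | 5.0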
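
The sample rows suggest that FamilySize equals SibSp + Parch + 1 and that IsAlone is 1 exactly when FamilySize is 1. This is an inference from the values shown, not something the report states. The column sums agree with it (443 + 318 + 838 = 1599), as does the match between the 502 rows with FamilySize 1 and the 502 rows with IsAlone 1. A minimal check on a few sample rows, with `family_features` as a hypothetical helper:

```python
# (SibSp, Parch, FamilySize, IsAlone) copied from the sample rows of this report.
rows = [
    (1, 0, 2, 0),  # Braund, Mr. Owen Harris
    (0, 0, 1, 1),  # Heikkinen, Miss. Laina
    (3, 1, 5, 0),  # Palsson, Master. Gosta Leonard
    (0, 2, 3, 0),  # Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)
    (0, 5, 6, 0),  # Rice, Mrs. William (Margaret Norton)
]


def family_features(sibsp: int, parch: int) -> tuple:
    """Assumed derivation: family size counts the passenger plus relatives aboard."""
    family_size = sibsp + parch + 1
    is_alone = 1 if family_size == 1 else 0
    return family_size, is_alone


matches = [family_features(s, p) == (size, alone) for s, p, size, alone in rows]
print(all(matches))

# Column sums from the SibSp, Parch and FamilySize sections (838 observations).
sums_agree = 443 + 318 + 838 == 1599
print(sums_agree)
```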